Character-aware Attention Residual Network for Sentence Representation
Abstract
Text classification in general is a well-studied area, but classifying short and noisy text remains challenging. Feature sparsity is a major issue, and the quality of the document representation has a great impact on classification accuracy. Existing methods represent text using the bag-of-words model with TF-IDF or other weighting schemes. More recently, word embeddings and even document embeddings have been proposed to represent text, with the aim of capturing features at both the word level and the sentence level. However, character-level information is usually ignored. In this paper, we take word morphology and word semantic meaning into consideration, representing them by a character-aware embedding and a distributed word embedding, respectively. By concatenating the character-level and distributed word embeddings and arranging the words in order, a sentence representation matrix is obtained. To overcome the data sparsity problem of short text, a sentence representation vector is then derived from different views of the sentence representation matrix. These views contribute to the construction of an enriched sentence embedding. We employ a residual network on the sentence embedding to obtain a consistent and refined sentence representation. Evaluated on a few short text datasets, our model outperforms state-of-the-art models.
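The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the views are computed, so mean and max pooling stand in for them here, and the names `concat_embeddings`, `mean_view`, `max_view`, `residual_block`, and `sentence_vector` are hypothetical. The embeddings are assumed to be given as plain lists of floats.

```python
def concat_embeddings(char_vecs, word_vecs):
    """Build the sentence matrix: one row per word, in order.

    Each row is the word's character-aware embedding followed by
    its distributed word embedding.
    """
    return [c + w for c, w in zip(char_vecs, word_vecs)]


def mean_view(matrix):
    """Assumed view 1: column-wise mean over the words."""
    n = len(matrix)
    return [sum(col) / n for col in zip(*matrix)]


def max_view(matrix):
    """Assumed view 2: column-wise maximum over the words."""
    return [max(col) for col in zip(*matrix)]


def residual_block(x, weights):
    """Residual connection: y = x + relu(W x), with W square."""
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)))
        for row in weights
    ]
    return [xi + hi for xi, hi in zip(x, hidden)]


def sentence_vector(char_vecs, word_vecs, weights):
    """Sentence matrix -> concatenated views -> residual refinement."""
    matrix = concat_embeddings(char_vecs, word_vecs)
    views = mean_view(matrix) + max_view(matrix)
    return residual_block(views, weights)
```

With a character embedding of size 2 and a word embedding of size 3, each row of the sentence matrix has 5 entries, the two pooled views give a vector of size 10, and `weights` must then be a 10 x 10 matrix. In a trained model the weights and both embeddings would be learned; here they are simply inputs.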
Similar resources
Char-Net: A Character-Aware Neural Network for Distorted Scene Text Recognition
In this paper, we present a Character-Aware Neural Network (Char-Net) for recognizing distorted scene text. Our Char-Net is composed of a word-level encoder, a character-level encoder, and an LSTM-based decoder. Unlike previous work which employed a global spatial transformer network to rectify the entire distorted text image, we take an approach of detecting and rectifying individual characters....
Remuneration of Non-Executive Independent Directors Review the View of Representation Theory
Management tries to maximize its own rewards, whether measured in terms of net profit, return on investment (performance), or other accounting measures, or pursued through positive changes in the prices of corporate securities. In other words, managers seeking to maximize their interests try to improve corporate performance, and the improvement of the capital Investors will be aw...
Compositional Sentence Representation from Character Within Large Context Text
In this work, we target two problems in representing a sentence on the basis of its constituent word sequence: the data-sparsity problem of non-compositional word embeddings, and the lack of use of inter-sentence dependency. To address these two problems, we propose a Hierarchical Composition Recurrent Network (HCRN), which consists of a hierarchy with 3 levels of compositional models: character, word an...
Character-Based Text Classification using Top Down Semantic Model for Sentence Representation
Despite the success of deep learning on many fronts, especially image and speech, its application to text classification is often still not as good as a simple linear SVM on an n-gram TF-IDF representation, especially for smaller datasets. Deep learning tends to emphasize sentence-level semantics when learning a representation with models like recurrent neural network or recursive neural network,...
From Deep to Shallow: Transformations of Deep Rectifier Networks
In this paper, we introduce transformations of deep rectifier networks, enabling the conversion of deep rectifier networks into shallow rectifier networks. We subsequently prove that any rectifier net of any depth can be represented by a maximum of a number of functions that can be realized by a shallow network with a single hidden layer. The transformations of both deep rectifier nets and deep...
Publication date: 2017